Scaling laws of strategic behaviour and size heterogeneity in agent dynamics 
The dynamics of many socioeconomic systems is determined by the decision making process of
agents. The decision process depends on agents' characteristics, such as preferences, risk aversion,
behavioral biases, etc. [1, 2]. In addition, in some systems the size of agents can be highly
heterogeneous, leading to very different impacts of agents on the system dynamics [3, 4, 5, 6, 7, 8]. The large
size of some agents poses challenging problems to agents who want to control their impact, either
by forcing the system in a given direction or by hiding their intentionality. Here we consider the
financial market as a model system, and we study empirically how agents strategically adjust the
properties of large orders in order to meet their preferences and minimize their impact. We quantify
this strategic behavior by detecting scaling relations of allometric nature [9] between the variables
characterizing the trading activity of different institutions. We observe power law distributions in
the investment time horizon, in the number of transactions needed to execute a large order, and in
the traded value exchanged by large institutions, and we show that heterogeneity of agents is a key
ingredient for the emergence of some aggregate properties characterizing this complex system.

PACS numbers:

In many complex systems agents self-organize in an ecology of different "species" interacting in a variety
of ways. Agents differ not only in their strategies, information, and preferences, but they can also be very different
in their size. Examples include individuals' wealth and firm size. The presence of agents with large size
poses several challenging questions. It is likely that large agents impact the system in a way that is significantly
different from small ones. Indeed, small agents can easily hide their intentionality, whereas large agents cannot
do so easily and must adopt strategies that take into account their own effect, because revealing their intentions could
decrease their fitness.
Financial markets are an ideal system to investigate this problem. There is empirical evidence that market
participants are very heterogeneous in size. For example, the sizes of banks [7] and mutual funds [8] follow Zipf's law, i.e. the
probability that the size of a participant is larger than $x$ decays as $1/x$ [4]. As a consequence, large investors
usually need to trade large quantities that can significantly affect prices. The associated cost is called market impact
[10, 11, 12, 13, 14, 15]. For this reason large investors refrain from revealing their demand or supply, and they typically
trade their large orders incrementally over an extended period of time. These large orders are called packages [3, 13]
or hidden orders and are split into smaller trades as the result of a complex optimization procedure which takes into
account the investor's preferences, risk aversion, investment horizon, etc.

Here we investigate the trading activity of a large fraction of the financial firms exchanging a financial asset at 
the Spanish Stock Market (Bolsas y Mercados Españoles, BME) in the period 2001-2004 (see Materials and Methods
section for a description of data). Firms are credit entities and investment firms which are members of the stock 
exchange and are entitled to trade in the market. 

Our approach aims to be comprehensive, analysing the overall dynamics of all packages exchanged in
the market. However, our database does not contain direct information on packages, so this information must
be statistically inferred from the available data. Since we do not have information on clients but only on firms, we
develop a detection algorithm (see Materials and Methods for a description of the algorithm) which is not sensitive
to small fluctuations in the buy/sell activity of a firm. The algorithm detects time segments in the inventory time
evolution of a firm when the firm acts as a net buyer or seller at an approximately constant rate. We call these
segments patches and we assume that each of these patches contains at least one package.

Since firms act simultaneously as brokers for many clients, it is rather frequent that not all the transactions in a patch
have the same sign. However, a vast majority of firm inventory time series can be partitioned into patches with a well
defined direction, to buy or to sell. This is probably due to the fact that in most cases the trading activity of a firm is
dominated by the activity of one big client. We consider directional patches, i.e. patches with a well defined direction
(see Figure [1]). The characterizing variables of a directional patch are the time length $T$ (in seconds) of the patch,
measured as the time interval between the first and the last order of the patch, the traded value $V_m$, and the number
$N_m$ of trades characterizing the patch. For example, for buy patches $N_m$ is the number of buy trades and $V_m$ is the
purchased value.

FIG. 1: Example of an inventory time series. The series refers to a particular firm trading Santander. The vertical lines indicate
the positions where our algorithm places the boundary between two patches. The red patches are directional patches. Due to
their statistical nature, each patch contains both buy trades (with a total traded value $V_b$) and sell trades (with a total traded value $V_s$).
We consider directional patches, i.e. patches where either $V_b/V > \theta$ (buy patch) or $V_s/V > \theta$ (sell patch), where $V = V_b + V_s$.
For buy patches $V_m = V_b$, whereas for sell patches $V_m = V_s$. In the present study we set $\theta = 75\%$. Directional patches are
shown as red lines. The black patches are not directional and are not considered in the rest of the paper.

We investigate first the distributional properties of the patches identified by our algorithm. Figure [2] shows the
distribution of $T$, $N_m$, and $V_m$ for the three investigated stocks. The asymptotic behavior of all three distributions
can be approximated by a power law function $P(X) \sim 1/X^{\zeta_X+1}$, where $X$ can be $T$, $N_m$, or $V_m$ and $\zeta_X$ is the exponent
characterizing the power law behavior. A summary of the estimated exponents is shown in Table I, from which one can
conclude that $\zeta_{V_m} \simeq 2$, $\zeta_{N_m} \simeq 1.8$, and $\zeta_T \simeq 1.3$. Our analysis makes explicit the presence of very broad distributions
for the three variables characterizing a patch. In fact, the very low value of the exponents is consistent with the
conclusion that $T$ and $N_m$ belong to the domain of attraction of Lévy stable distributions. This result indicates that in the market
there is a huge heterogeneity in the scales characterizing the trading profile of the investors.
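
As an illustration of how a tail exponent $\zeta_X$ of this kind can be estimated, the following is a minimal sketch of the Hill estimator in Python. It is not the authors' code: the function name `hill_exponent`, the choice of the number `k` of tail observations, and the synthetic Pareto sample are all assumptions made for the example, and the confidence interval uses the standard asymptotic approximation.

```python
import math
import random


def hill_exponent(samples, k):
    """Hill estimate of the tail exponent zeta from the k largest observations.

    Assumes the survival function decays as x**(-zeta), i.e. the density
    decays as 1/x**(zeta + 1). Returns (zeta, (low, high)) where the interval
    is the asymptotic 95% confidence interval zeta * (1 -/+ 1.96 / sqrt(k)).
    """
    ordered = sorted(samples, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(samples)")
    threshold = ordered[k]  # the (k+1)-th largest value acts as tail threshold
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    zeta = 1.0 / mean_log_excess
    half_width = 1.96 * zeta / math.sqrt(k)
    return zeta, (zeta - half_width, zeta + half_width)


# Synthetic check on an exact Pareto sample (hypothetical data, not BME data).
rng = random.Random(0)
true_zeta = 1.8
pareto_sample = [(1.0 - rng.random()) ** (-1.0 / true_zeta) for _ in range(20000)]
zeta_hat, ci = hill_exponent(pareto_sample, k=2000)
print(f"zeta = {zeta_hat:.2f}, 95% CI = ({ci[0]:.2f}; {ci[1]:.2f})")
```

In practice the estimate depends on `k`, so one would typically inspect it over a range of `k` values before quoting a single number.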

The volume of the packages is likely to be related to the size of the investors. Large investors need to trade large
packages to rebalance their portfolio. Gabaix et al. [18] developed a theory which predicts that package size should
be power law distributed with an exponent $\zeta_{V_m} = 3/2$. The value we find, $\zeta_{V_m} \simeq 2$, is slightly larger than the one
predicted by them. By contrast, the value $\zeta_{N_m} = 3$ derived by the theory in [18] is significantly larger than our
estimate ($\zeta_{N_m} \simeq 1.8$). Finally, the power law distribution of the package time length $T$ might reflect the heterogeneity of
time scales among investors. The distribution of $T$ is compatible with the ones obtained by using specialized databases
describing the investment packages of large investors [13, 17] (see Figure [2]). The theory of Gabaix et al. [18] predicts the
value $\zeta_T = 3$, which is significantly larger than our value ($\zeta_T \simeq 1.3$). The presence of a power law distribution of
investors' time scales has been recently suggested in stylized models of investment decisions [19, 20, 21].

The role of size heterogeneity in the emergence of power law distributions will be considered at the end of the paper. 
To complete our characterization of firm patches, we now consider the relation between the variables characterizing 
each patch. Specifically, by applying Principal Component Analysis (PCA) to the set of points with coordinates
$(\log T, \log N_m, \log V_m)$, we investigate the allometric relations between any two of the above variables, i.e.

\[ N_m \sim V_m^{g_1}, \qquad T \sim V_m^{g_2}, \qquad N_m \sim T^{g_3} \tag{1} \]

Figure [3] shows the scatter plots and the contour plots for the stock Telefónica. In all three cases a clear dependence
between the variables is seen. The PCA shows that the first eigenvalue explains on average 91%, 83%, and 89% of
the variance for the first, second, and third allometric relation, respectively, indicating a strong correlation between
the variables. The estimated exponents (see Table I) are consistent across different stocks, so that the allometric relations
are

\[ N_m \sim V_m^{1.1}, \qquad T \sim V_m^{1.9}, \qquad N_m \sim T^{0.66} \tag{2} \]
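
The bivariate exponents can be read off the direction of the first principal component in log-log coordinates. The sketch below shows one way to do this for two variables using the closed-form eigen-decomposition of the 2x2 covariance matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `pca_exponent` and the synthetic data are hypothetical, and the bootstrap used in the paper for the errors is omitted.

```python
import math
import random


def pca_exponent(log_x, log_y):
    """Exponent g of y ~ x**g from the first principal axis of (log x, log y).

    Returns (g, explained), where explained is the fraction of the total
    variance carried by the first eigenvalue of the covariance matrix.
    """
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in log_y) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y)) / (n - 1)
    if sxy == 0:
        raise ValueError("variables are uncorrelated; principal axis slope undefined")
    disc = math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)
    # The leading eigenvector is (sxy, lambda_1 - sxx); its slope is the exponent.
    slope = (syy - sxx + disc) / (2.0 * sxy)
    lambda_1 = 0.5 * (sxx + syy + disc)
    return slope, lambda_1 / (sxx + syy)


# Synthetic patches (hypothetical): a latent scale drives both variables.
rng = random.Random(42)
latent = [rng.gauss(0.0, 2.0) for _ in range(5000)]
log_v = [10.0 + s + rng.gauss(0.0, 0.2) for s in latent]
log_n = [3.0 + 1.1 * s + rng.gauss(0.0, 0.2) for s in latent]
g1_hat, explained = pca_exponent(log_v, log_n)
print(f"g1 = {g1_hat:.2f}, variance explained = {explained:.1%}")
```

Unlike an ordinary least-squares fit, the principal axis treats both variables symmetrically, which is a natural choice when neither variable is measured without noise.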

FIG. 2: Distribution of $T$, $N_m$, and $V_m$ for the stocks Banco Bilbao Vizcaya Argentaria (BBVA), Banco Santander Central
Hispano (SAN), and Telefónica (TEF). In the panel showing the distribution of $T$ we also plot the distributions of packages reported
in the literature on packages. Specifically, blue circles are results from Ref. [13] for packages traded at the New York Stock Exchange
and magenta squares are results from Ref. [17] for packages traded at the Australian Stock Exchange.

TABLE I: Summary of the properties of detected patches. The number in parentheses next to the ticker symbol is the number
of patches detected for the considered stock. Rows 1-3: Tail exponents of the distribution of $V_m$, $N_m$, and $T$ estimated with
the Hill estimator (or Maximum Likelihood Estimator). In parentheses we report the 95% confidence interval. Rows 4-6:
Exponents of the allometric relations defined in Eq. (1). The exponents are estimated with PCA and the errors are estimated
with bootstrap. In parentheses we report the 95% confidence interval. Rows 7-9: Percentage of firms with at least 10 patches
for which one cannot reject the hypothesis of lognormality with 95% confidence according to the Jarque-Bera test. The numbers in
parentheses are the number of firms for which one cannot reject the hypothesis of lognormality divided by the number of firms
used in the test.

                 BBVA (2104)         SAN (2086)          TEF (2062)
$\zeta_{V_m}$    2.3 (1.9; 2.7)      2.0 (1.7; 2.3)      1.9 (1.6; 2.2)
$\zeta_{N_m}$    2.0 (1.7; 2.3)      1.7 (1.4; 2.0)      1.7 (1.4; 2.0)
$\zeta_T$        1.5 (1.3; 1.7)      1.5 (1.3; 1.7)      1.2 (1.0; 1.4)

$g_1$            1.08 (1.05; 1.12)   1.06 (1.01; 1.10)   1.07 (1.04; 1.11)
$g_2$            1.81 (1.69; 1.93)   1.81 (1.68; 1.94)   2.00 (1.88; 2.14)
$g_3$            0.68 (0.65; 0.71)   0.68 (0.65; 0.70)   0.62 (0.59; 0.64)

$T$              75 (15/20)          63 (17/27)          77 (24/31)
$N_m$            90 (18/20)          100 (27/27)         100 (31/31)
$V_m$            90 (18/20)          100 (27/27)         94 (29/31)

The presence of scaling relations between the variables was first suggested in Ref. [18], but it is worth noting that the
theory developed in that paper predicts $g_1 = g_2 = 1/2$ and $g_3 = 1$, and these values are quite different from the ones
we estimate from data. The first allometric relation indicates that the number of transactions into which a package is
split is approximately proportional to the total traded value of the package. This implies that the mean transaction
volume is roughly independent of the size of the package. This mean value is on average determined by the size of the
available volume at the best quote, indicating that the trader does not trade orders larger than the volume available
at the best quote, probably to avoid being too aggressive [22].
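
The step from the first allometric relation to a roughly size-independent mean transaction value can be made explicit. The short derivation below is a sketch that uses only the relation $N_m \sim V_m^{g_1}$ and the estimate $g_1 \simeq 1.1$; the symbol $\langle v \rangle$ for the mean traded value per transaction is introduced here for illustration.

```latex
\[
  \langle v \rangle \equiv \frac{V_m}{N_m} \sim \frac{V_m}{V_m^{g_1}} = V_m^{\,1-g_1},
  \qquad 1 - g_1 \simeq -0.1 \quad \text{for } g_1 \simeq 1.1 .
\]
% Example: a hundredfold increase in V_m changes <v> by a factor
% 100^{-0.1} = 10^{-0.2} \approx 0.63, i.e. by less than a factor of two.
```

With an exponent this close to zero, the mean transaction value varies only weakly over several decades of package size.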

FIG. 3: Scatter plots of the variables $T$, $N_m$, and $V_m$ for Telefónica. The black lines are contour lines of the bivariate
probability density function. The insets show the probability density functions of the three exponents $g_1$, $g_2$, and $g_3$ describing
the allometric relations of Eq. (1), computed on the patches of individual firms with at least 10 patches. The red vertical lines
indicate the values of the scaling exponents computed in the pool of all firms and reported in rows 4-6 of Table I. It is worth
noting that the dispersion of $g_1$ is significantly larger than the one for the other two exponents.

We consider the relation between the three variables together by performing a PCA on the set of points describing
the patches and identified by the coordinates $(\log T, \log N_m, \log V_m)$ [23]. The set of points effectively lies on a
two-dimensional manifold which has one dimension much larger than the other. The fact that the first eigenvalue is large
indicates that one factor dominates the trading strategy. The allometric relations of the three variables associated
with the first eigenvalue of the PCA provide an estimate of the exponents ($g_1 \simeq 1.2$, $g_2 \simeq 1.8$, and $g_3 \simeq 0.67$ for
Telefónica) which, unlike in the bivariate case, are by construction mutually consistent and only slightly different
from the ones obtained from the bivariate analysis.
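
The consistency of the trivariate exponents follows directly from the fact that all three are read off a single direction. As a sketch, write the first principal direction in $(\log T, \log N_m, \log V_m)$ space as $(a_T, a_N, a_V)$; these component names are notation assumed here, not taken from the paper.

```latex
\[
  g_1 = \frac{a_N}{a_V}, \qquad
  g_2 = \frac{a_T}{a_V}, \qquad
  g_3 = \frac{a_N}{a_T} = \frac{g_1}{g_2}.
\]
% Check with the values quoted for Telefonica: g_1 / g_2 = 1.2 / 1.8 \approx 0.67 = g_3.
```

The bivariate estimates carry no such constraint, which is why they need not satisfy $g_3 = g_1/g_2$ exactly.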

We now go back to the problem of assessing the role of firm heterogeneity. The first scientific question is: Is the
fat tailed distribution of $T$, $N_m$, and $V_m$ due to the fact that individual firms place heterogeneously sized packages, or
is it an effect of the aggregation of many different firms together? To answer this question we test the hypothesis
that the patches identified for a given firm trading a given stock are lognormally distributed. The test (see Table I)
shows that for most of the trading firms we cannot reject the hypothesis that the patches have characteristic sizes
distributed as a lognormal. Since we reject the lognormal hypothesis for the pool obtained by considering all the firms,
we conclude that the power law distribution of $T$, $N_m$, and $V_m$ is due to a heterogeneity in patch scale between different
firms rather than within each firm. The second scientific question concerns the role of firm heterogeneity for
scaling laws. To assess the role of heterogeneity, for each firm we compute the exponents $g_1$, $g_2$, and $g_3$ of the bivariate
relations of Eq. (1) (see insets of Figure [3]). We observe that the exponents obtained for each firm are distributed around
the corresponding value of the exponent obtained for the pool. This result indicates that the bivariate allometric
relations are not an effect of the aggregation but are observed, on average, also for individual firms.
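
A lognormality test of this kind amounts to a normality test on the logarithms of the patch variables. The following is a minimal sketch of the Jarque-Bera statistic applied to log values. It is an illustration rather than the authors' procedure: the function name `jarque_bera_lognormal` and the synthetic samples are hypothetical, and the p-value relies on the asymptotic chi-square law with two degrees of freedom, which is only a rough approximation for firms with as few as 10 patches.

```python
import math
import random


def jarque_bera_lognormal(values):
    """Jarque-Bera test of lognormality, i.e. of normality of log(values).

    Returns (jb, p_value). The p-value uses the asymptotic chi-square
    distribution with 2 degrees of freedom, whose survival function is
    exp(-x / 2).
    """
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean = sum(logs) / n
    m2 = sum((x - mean) ** 2 for x in logs) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((x - mean) ** 3 for x in logs) / n
    m4 = sum((x - mean) ** 4 for x in logs) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


# Hypothetical samples: one lognormal, one power-law (Pareto) distributed.
rng = random.Random(1)
lognormal_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
pareto_sample = [rng.paretovariate(1.5) for _ in range(2000)]
jb_lognormal, p_lognormal = jarque_bera_lognormal(lognormal_sample)
jb_pareto, p_pareto = jarque_bera_lognormal(pareto_sample)
# Lognormality is rejected at 95% confidence when the p-value is below 0.05.
print(f"lognormal: JB = {jb_lognormal:.1f}, p = {p_lognormal:.3f}")
print(f"Pareto:    JB = {jb_pareto:.1f}, p = {p_pareto:.3g}")
```

Applied firm by firm and then to the pooled sample, a test like this is what separates heterogeneity within each firm from heterogeneity between firms.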

In conclusion, our comprehensive investigation of packages traded at BME shows that heterogeneity of firms plays
an essential role in the emergence of power law tails in the investment time horizon, in the number of transactions,
and in the traded value exchanged by packages. By contrast, scaling laws between the variables characterizing each
package are essentially the same across different firms, with the possible exception of the relation between $T$ and $V_m$,
perhaps reflecting different degrees of aggressiveness of firms.
I. MATERIALS AND METHODS

Our database of the electronic open market SIBE (Sistema de Interconexión Bursátil Electrónico) allows us to
follow each transaction performed by all the firms registered at BME. In 2004 the BME was the eighth largest in the world by
market capitalization. We consider the stocks Banco Bilbao Vizcaya Argentaria (BBVA), Banco Santander Central
Hispano (SAN), and Telefónica (TEF). We consider only the most active firms, defined by the criterion that each
firm made at least 1,000 trades per year and was active at least 200 days per year. The number of firms is 50 (BBVA), 55
(SAN), and 61 (TEF). These firms are involved in 81-86% of the transactions. The investigated period is 2001-2004.
We do not consider other stocks because we have verified that the number of detected patches is too small to perform
careful statistical estimation.

The series under study is the series of signed traded value. For each firm and for each stock we construct the series
composed of all the trades performed by the firm, with a value $+v$ for a buy trade and $-v$ for a sell trade, where $v$ is
the value (in Euros) of the traded shares.
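
The construction of the signed series can be sketched in a few lines of Python. The record layout `(side, value_eur)` and the function name `signed_values` are assumptions made for the example, and taking the inventory as the running sum of the signed values is likewise an assumption about how the inventory time series is obtained.

```python
from itertools import accumulate


def signed_values(trades):
    """Map (side, value_eur) records to +v for buys and -v for sells."""
    series = []
    for side, value in trades:
        if side == "buy":
            series.append(+value)
        elif side == "sell":
            series.append(-value)
        else:
            raise ValueError(f"unknown side: {side!r}")
    return series


# Hypothetical trades of one firm in one stock, in chronological order.
trades = [("buy", 1000.0), ("buy", 500.0), ("sell", 200.0), ("buy", 300.0)]
series = signed_values(trades)
inventory = list(accumulate(series))  # running sum of signed traded value
print(series)
print(inventory)
```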

The method we use to detect statistically the presence of patches is adapted from Ref. [24], where it was introduced
to study the patchy non-stationarity of human heart rate. The algorithm works as follows. One moves a sliding
pointer along the signal and computes the mean of the subset of the signal to the left and to the right of the pointer.
From these mean values one computes a $t$ statistic and finds the position of the pointer for which the $t$ statistic is
maximal. The significance level of this value of $t$ is defined as the probability of obtaining it or a smaller value in a
random sequence. One then chooses a threshold (in our case 99%) and the sequence is cut if the significance level
exceeds the threshold. The cut position is the boundary between two consecutive patches. The procedure
continues recursively on the left and right subsets created by each cut. Before a new cut is accepted one also computes
$t$ between the right-hand new segment and its right neighbor and $t$ between the left-hand new segment and its left
neighbor, and one checks whether both values of $t$ are statistically significant according to the selected threshold. The process
stops when it is not possible to make a new cut with the selected significance.
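
The recursive structure of the procedure can be sketched as follows. This is a simplified illustration, not the algorithm of Ref. [24] as used in the paper: the significance test is replaced by a fixed threshold `t_min` on the $t$ statistic, the check against neighboring segments is omitted, and the names `segment`, `best_cut`, `t_min`, and `min_len` are hypothetical.

```python
import math


def _t_statistic(left, right):
    """Two-sample t statistic (pooled variance) for the difference in means."""
    n1, n2 = len(left), len(right)
    m1, m2 = sum(left) / n1, sum(right) / n2
    ss1 = sum((x - m1) ** 2 for x in left)
    ss2 = sum((x - m2) ** 2 for x in right)
    scale = math.sqrt((ss1 + ss2) / (n1 + n2 - 2)) * math.sqrt(1.0 / n1 + 1.0 / n2)
    if scale == 0.0:
        return math.inf if m1 != m2 else 0.0
    return abs(m1 - m2) / scale


def best_cut(signal, min_len):
    """Slide a pointer along the signal; return (max t, pointer position)."""
    best_t, best_i = 0.0, None
    for i in range(min_len, len(signal) - min_len + 1):
        t = _t_statistic(signal[:i], signal[i:])
        if t > best_t:
            best_t, best_i = t, i
    return best_t, best_i


def segment(signal, t_min, min_len=5, offset=0):
    """Recursively cut the signal; return the sorted list of cut positions.

    ASSUMPTION: a cut is accepted when the maximal t is at least t_min. The
    original method instead converts t into a significance level and compares
    it with a threshold, and also re-checks each cut against its neighbors.
    """
    if len(signal) < 2 * min_len:
        return []
    t, i = best_cut(signal, min_len)
    if i is None or t < t_min:
        return []
    left_cuts = segment(signal[:i], t_min, min_len, offset)
    right_cuts = segment(signal[i:], t_min, min_len, offset + i)
    return left_cuts + [offset + i] + right_cuts


# Hypothetical signed-value series: 30 buy-dominated trades, then 30 sell-dominated.
wiggle = [0.1 * ((i % 3) - 1) for i in range(30)]
signal = [1.0 + w for w in wiggle] + [-1.0 + w for w in wiggle]
cuts = segment(signal, t_min=4.0)
print(cuts)
```

The quadratic cost of recomputing the means at every pointer position is acceptable for a sketch; running sums would make each scan linear.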

In the present study, we are mainly interested in directional patches, i.e. patches where the trader consistently
buys or sells a large amount of shares. In other words, we wish to exclude patches in which the inventory of the firm
is diffusing randomly, without a drift. To this end, for each patch we compute the total value purchased $V_b$, the total
value sold $V_s$, and the total value $V = V_b + V_s$. We then consider a patch as directional when either $V_b/V > \theta$ (buy
patch) or $V_s/V > \theta$ (sell patch). The parameter $\theta$ can be varied and in the present study we set it to $\theta = 75\%$. We
obtain similar results for different values of $\theta$ such as 85% and 95%. Finally, in the present paper we consider patches
with at least 10 trades.
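
The directionality criterion translates directly into code. The sketch below classifies a patch given its signed traded values and returns the patch variables $V_m$ and $N_m$ for directional patches. The function name `classify_patch` and the return layout are assumptions made for the example, and the requirement of at least 10 trades is left to the caller.

```python
def classify_patch(signed_values, theta=0.75):
    """Classify a patch as 'buy', 'sell', or 'non-directional'.

    signed_values holds +v for buy trades and -v for sell trades.
    Returns (label, v_m, n_m): for a buy patch v_m is the purchased value and
    n_m the number of buy trades, and symmetrically for a sell patch. For a
    non-directional patch v_m and n_m are returned as 0.0 and 0.
    """
    v_buy = sum(v for v in signed_values if v > 0)
    v_sell = sum(-v for v in signed_values if v < 0)
    total = v_buy + v_sell
    if total == 0:
        raise ValueError("patch has no traded value")
    if v_buy / total > theta:
        return "buy", v_buy, sum(1 for v in signed_values if v > 0)
    if v_sell / total > theta:
        return "sell", v_sell, sum(1 for v in signed_values if v < 0)
    return "non-directional", 0.0, 0


# Hypothetical patches with 10 trades each.
buy_patch = [100.0] * 9 + [-50.0]      # 900 bought out of 950 traded
mixed_patch = [100.0, -100.0] * 5      # half bought, half sold
print(classify_patch(buy_patch))
print(classify_patch(mixed_patch))
```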